Efficiently Learning Mixtures of Two Arbitrary Gaussians
Authors
Abstract
Given data drawn from a mixture of multivariate Gaussians, a basic problem is to accurately estimate the mixture parameters. We provide a polynomial-time algorithm for this problem for the case of two Gaussians in n dimensions (even if they overlap), with provably minimal assumptions on the Gaussians, and polynomial data requirements. In statistical terms, our estimator converges at an inverse polynomial rate, and no such estimator (even exponential time) was known for this problem (even in one dimension). Our algorithm reduces the n-dimensional problem to the one-dimensional problem, where the method of moments is applied. The main technical challenge is proving that noisy estimates of the first six moments of a univariate mixture suffice to recover accurate estimates of the mixture parameters, as conjectured by Pearson (1894), and in fact these estimates converge at an inverse polynomial rate. As a corollary, we can efficiently perform near-optimal clustering: in the case where the overlap between the Gaussians is small, one can accurately cluster the data, and when the Gaussians have partial overlap, one can still accurately cluster those data points which are not in the overlap region. A second consequence is a polynomial-time density estimation algorithm for arbitrary mixtures of two Gaussians, generalizing previous work on axis-aligned Gaussians (Feldman et al., 2006).
∗ Microsoft Research New England. Part of this work was done while the author was at Georgia Institute of Technology, supported in part by NSF CAREER-0746550, SES-0734780, and a Sloan Fellowship. This paper is not eligible for best student paper.
† Massachusetts Institute of Technology. Supported in part by a Fannie and John Hertz Foundation Fellowship. Part of this work was done while at Microsoft Research New England.
‡ University of California, Berkeley. Supported in part by an NSF Graduate Research Fellowship. Part of this work was done while at Microsoft Research New England.
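The reduction described in the abstract rests on two facts that can be illustrated numerically: projecting a Gaussian with mean m and covariance C onto a unit direction u gives a univariate Gaussian with mean u·m and variance uᵀCu, and the first six raw moments of the projected mixture are a weighted sum of the component moments. The following is a minimal sketch of that forward step only, assuming NumPy; it does not implement the paper's parameter-recovery procedure, and all function names and the demo parameters are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_raw_moments(mu, sigma, order=6):
    """Raw moments E[X^k], k = 1..order, of N(mu, sigma^2).

    Uses the recurrence m_k = mu * m_{k-1} + (k - 1) * sigma^2 * m_{k-2}.
    """
    m = [1.0, float(mu)]
    for k in range(2, order + 1):
        m.append(mu * m[k - 1] + (k - 1) * sigma**2 * m[k - 2])
    return np.array(m[1:order + 1])


def mixture_raw_moments(w, mu1, sigma1, mu2, sigma2, order=6):
    """Raw moments of the mixture w*N(mu1, sigma1^2) + (1-w)*N(mu2, sigma2^2)."""
    return (w * gaussian_raw_moments(mu1, sigma1, order)
            + (1 - w) * gaussian_raw_moments(mu2, sigma2, order))


def empirical_raw_moments(x, order=6):
    """Sample estimates of E[X^k] for k = 1..order."""
    x = np.asarray(x, dtype=float)
    return np.array([np.mean(x**k) for k in range(1, order + 1)])


def project(points, direction):
    """Project n-dimensional points onto a direction, normalized to unit length."""
    direction = np.asarray(direction, dtype=float)
    return points @ (direction / np.linalg.norm(direction))


if __name__ == "__main__":
    # Hypothetical demo parameters, not taken from the paper.
    rng = np.random.default_rng(0)
    n, samples, w = 5, 200_000, 0.4
    mean1, mean2 = np.zeros(n), np.ones(n)
    cov1 = np.eye(n)
    a = rng.normal(size=(n, n))
    cov2 = a @ a.T / n + 0.5 * np.eye(n)  # symmetric positive definite

    # Draw each sample from component 1 with probability w.
    k1 = int((rng.random(samples) < w).sum())
    points = np.vstack([
        rng.multivariate_normal(mean1, cov1, size=k1),
        rng.multivariate_normal(mean2, cov2, size=samples - k1),
    ])

    u = rng.normal(size=n)
    u /= np.linalg.norm(u)
    x = project(points, u)

    # Parameters of the projected (univariate) components.
    mu1, s1 = u @ mean1, np.sqrt(u @ cov1 @ u)
    mu2, s2 = u @ mean2, np.sqrt(u @ cov2 @ u)

    print("empirical:", empirical_raw_moments(x))
    print("analytic: ", mixture_raw_moments(w, mu1, s1, mu2, s2))
```

The empirical and analytic moments should agree up to sampling noise, which grows with the moment order; this is one way to see why the robustness of the six-moment inversion is the delicate part of the analysis.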
Similar resources
Sample-Efficient Learning of Mixtures
We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let F be an arbitrary class of probability distributions, and let F^k denote the class of k-mixtures of elements of F. Assuming the existence...
Learning Mixtures of Separated non-Spherical Gaussians
Mixtures of Gaussian (or normal) distributions arise in a variety of application areas. Many heuristics have been proposed for the task of finding the component Gaussians given samples from the mixture, such as the EM algorithm, a local-search heuristic from Dempster, Laird and Rubin (1977). These do not provably run in polynomial time. We present the first algorithm that provably learns the co...
Training Mixture Models at Scale via Coresets
How can we train a statistical mixture model on a massive data set? In this paper, we show how to construct coresets for mixtures of Gaussians and natural generalizations. A coreset is a weighted subset of the data, which guarantees that models fitting the coreset also provide a good fit for the original data set. We show that, perhaps surprisingly, Gaussian mixtures admit coresets of size poly...
Learning Mixtures of Gaussians
Mixtures of Gaussians are among the most fundamental and widely used statistical models. Current techniques for learning such mixtures from data are local search heuristics with weak performance guarantees. We present the first provably correct algorithm for learning a mixture of Gaussians. This algorithm is very simple and returns the true centers of the Gaussians to within the precision speci...
On Spectral Learning of Mixtures of Distributions
We consider the problem of learning mixtures of distributions via spectral methods and derive a tight characterization of when such methods are useful. Specifically, given a mixture-sample, let μi, Ci, wi denote the empirical mean, covariance matrix, and mixing weight of the i-th component. We prove that a very simple algorithm, namely spectral projection followed by single-linkage clustering, ...
Publication date: 2010